Abstract

Sensor networks are inherently complex networks, and many of the
problems associated with them require analysis of their global
characteristics. These are primarily affected by the topology of the
network. In this paper we present a general framework for the
topological analysis of a network, and develop distributed algorithms
in a generalized combinatorial setting to solve two seemingly
unrelated problems: 1) coverage hole detection and localization and
2) worm hole attack detection and localization. These solutions remain
coordinate free, as no a priori localization information of the nodes
is assumed. For the coverage hole problem, we follow a "divide and
conquer" approach, strategically dissecting the network so that the
overall topology is preserved while efficiently pursuing the detection
and localization of failures. The detection of holes is enabled by
first attributing a combinatorial object called a "Rips complex" to
each network segment, and by subsequently checking for the existence
of holes by way of the triviality of the first homology class of this
complex. Our estimate exponentially approaches the location of
potential holes with each iteration, yielding very fast convergence
coupled with optimal usage of valuable resources such as power and
memory. We then show a simple extension of the above problem to
address a well known problem in networks, namely the localization of a
worm hole attack. We demonstrate the effectiveness of the presented
algorithm with several substantiating examples.

1 Introduction 

The infrastructure of computing systems is rapidly transitioning from
centralized systems to distributed and pervasive systems. A very
important class of such systems is sensor networks, which find
applications in areas including environmental monitoring, health care
and military operations [T]. There has been considerable research
interest in this field over the past decade, addressing problems
including node localization [5], distributed compression [20],
probabilistic inference [3] and motion tracking. A unifying theme of
many of these problems is to glean consensus information by
systematically combining the data collected at individual nodes, in
accordance with the structure of the network. The consensus
information thus obtained characterizes the network, or the data in
the network as a whole, and better represents the underlying
phenomenon which can be inferred from the data at individual nodes.
This reveals the fundamental nature of sensor networks: they are
essentially complex networks in which global patterns emerge from
simple interactions between nodes. From an engineering perspective,
the fundamental challenge in sensor network applications is to cope
with limited resources: nodes have a limited communication capability
(they can only communicate with their neighbors), limited power and
limited memory. Furthermore, sensor networks are often deployed in
inaccessible locations and environments where maintenance is
impractical; this makes careful use of exhaustible resources such as
power imperative.
This unique set of circumstances motivates the use of techniques such
as topological analysis, which directly extracts global information
without being overly dependent on the local structure, thereby
alleviating the excessive need for resources. In this paper we
demonstrate the merits of such analysis by exploiting its tools to
solve two specific important problems: 1) coverage hole detection and
localization and 2) worm hole attack detection and localization.

The first problem, discussed in Section 4, seeks to identify an area
within a network which is not in the range of any sensor (and is hence
uncovered). The second problem investigates the detection and
localization of an attack called a worm hole, which has the potential
to substantially disrupt routing, localization and other tasks in a
network. In the next section, we briefly summarize the research in
topological analysis and the work related to the techniques presented
in this paper.

1.1 Topological Analysis in Sensor Networks 

Distributed algorithms for analyzing topological properties may be
broadly classified into three categories: geometric, topological, and
statistical methods. This categorization is based on the taxonomy
presented in [52], and a good overview of algorithms in these areas is
presented in [TT].
Topology may be described as the study of the arrangement of spaces
(manifolds or other data spaces), whereas geometry may be described as
the study of metrics (measures of distance) on these spaces. This
distinction characterizes the difference between distributed
algorithms using topological and geometric methods. In sensor
networks, the space of interest is first constructed using the node
parameters, followed by an analysis methodology of choice (for
example, in the coverage problem in Section 4 the space of interest is
the total coverage area). We may also view geometric methods as a
"fine" analysis of spaces and topological methods as a "coarse"
analysis. Statistical methods rely on the aggregate statistical
behavior of node parameters and try to infer the necessary information
about the network by tracing the changes in these statistics. Our
present work falls into the category of topological approaches. The
distinction between these methods can be illustrated by looking at
existing algorithms for the coverage problem.
In geometric methods, for example, the work in [TT] computes an
$\alpha$-hull of the node positions in order to identify the outer and
inner (coverage-hole) boundaries of a network. An $\alpha$-hull of a
set of points $V$ in a plane is given by the intersection of the
complements of all circles, of a radius determined by $\alpha$, such
that no point in $V$ lies inside these circles. The complement of a
circle is defined as the entire plane excluding the interior of this
circle. Some other geometric methods for the coverage problem are
presented in [TD] and [TT].
An example of a statistical approach to the coverage problem may be
found in [28]. It relies on the idea that nodes close to network
boundaries have fewer incident edges in the network graph than
internal nodes. The authors use statistical methods to derive suitable
thresholds to separate edge nodes from internal nodes using the node
degrees. In [21], boundary nodes are separated from internal nodes by
using a centrality measure which counts the number of shortest paths
that pass through a node. A higher centrality value occurs among
internal nodes.

In the Topological methodology, Morse theory and Algebraic Topology are 
the most commonly used tools. 

1.1.1 Morse Theoretic Methods 

Morse theoretic methods analyze the topology of a given topological
space, more specifically a manifold, by studying differentiable
functions defined on it. Consider a differentiable function
$f : M \to \mathbb{R}$ defined on a manifold $M$; the inverse image of
a point in $\mathbb{R}$ is called a level set. A principal tenet
underlying these methods is the observation that critical points of
this function, where the topology of the level sets changes, directly
reflect the underlying topological construction of the space.

Figure 1: A high level schematic of Algebraic Topology

For the coverage problem, an example of a Morse theoretic topological method 
is given in [12]. The authors find boundaries of a network by studying the 
behavior of connected components, the nodes of which are at equi-hop dis- 
tance from a randomly selected point in the network. The main observation 
here is that each of these components has a discontinuity at the boundaries 
of the network. 

Morse theoretic methods often provide simple and efficient (in complex- 
ity) methods to analyze the underlying topology, but the greatest challenge 
of these methods is often the construction of an appropriate function. In ad- 
dition, since the theory is mostly developed for manifolds, it requires stricter 
assumptions for discrete spaces such as positions of nodes in a network. For 
example, the work presented in [12] will fail to provide reasonable results if 
the node density is small, or their distribution is non-uniform. 

1.1.2 An Algebraic Topological Approach 

Algebraic topology, in contrast to Morse theory, is a relatively more
direct technique for analyzing the topology of a space, which is
easily expressed in terms of algebraic objects. There is an extensive
literature in algebraic topology [T6l[3T] which shows a very strong
relationship between topological spaces and their algebraic
counterparts. This also enables us to draw from an extensive source of
knowledge in algebra to develop fast and efficient algorithms. The
algebraic objects of choice have the following important properties:
they directly reflect the topological features of an underlying space,
and they are invariant to continuous deformations. We therefore follow
this approach in our work.
The use of algebraic topology for the coverage problem mentioned above
was first introduced in P,|T3|. Prior works propose a distributed
computation of homology groups, and [30] attempts to localize the
holes by formulating localization as an optimization problem. We
further exploit the natural spatial constraint of the coverage holes
to formulate a new, effective and efficient "divide and conquer"
algorithm. To the best of our knowledge, [50] is the first attempt at
distributively localizing holes using an algebraic topological
approach, and we compare our work on coverage holes with the results
presented there. A preliminary version of this paper was presented in
[7].

A hole may be caused by a) a deployment error, b) some catastrophic
event such as an explosion, or c) the presence of a jammer which
disables the communication of the associated nodes, thereby hiding
their presence within the network. In the case of a deployment error,
hole localization helps in targeted redeployment and, more generally,
helps in precisely identifying the location of the cause of failure
events. Routing algorithms such as geographic routing [27] and some
other distributed signal processing algorithms depend heavily on
certain assumptions about the topology of the network of interest
[2S]. Having this knowledge of the overall topology may therefore be
very useful.
Other network failures which may have a devastating impact include
worm holes. A worm hole attack is typically launched by two colluding
external attackers who do not authenticate themselves as legitimate
nodes to the network. When initiating a worm hole attack, an attacker
overhears packets in one part of the network and tunnels them through
the worm hole link (external to the network) to another part of the
network. This effectively generates a false scenario in which the
original sender appears to be present in the neighborhood of the
remote location. An illustration of a worm hole is given in
Figure ??.

Many routing algorithms depend on the nodes' ability to accurately
discover their neighboring nodes. The nodes ordinarily broadcast
beacons (including their ID and other information) to their neighbors.
If the neighbor discovery beacons are tunneled through worm holes, the
good nodes will get false information about their routes. Although
finding faulty routes is in itself a problem, worm holes can cause
further critical security threats using these faulty routes. The
resulting effect of worm holes on routing is that a worm hole link is
included in most of the computed routes. This in turn gives an
attacker complete control over the transmission of great amounts of
data, which may be selectively or completely dropped. The impact of a
worm hole on a route discovery procedure in a sensor network has been
studied at length in [niiig.

In the absence of known coordinates, such as those provided by GPS,
nodes in a sensor network depend on the positions of their neighbors
to triangulate their own positions. A limited deployment of hardware
(GPS) over a few nodes would then be sufficient for all of the nodes
to compute their positions. Much work has been done on distributed
localization in sensor networks [SlEI], and the predominant approach
relies on a strong correlation between the geographic vicinity and the
communication capability of the nodes. Note that worm holes distort
this correlation, and will hence adversely affect the localization
algorithms. A study of the impact of worm holes on localization
procedures can be found in [19]. In light of the serious impact worm
holes may have on a sensor network, we propose to adapt the strategy
we developed for analyzing the coverage problem to not only detect but
also localize these failures.

1.2 Paper Organization 

The balance of the paper is organized as follows. In Section 2 we
formalize both the coverage hole and the worm hole problem by a
precise mathematical formulation. We subsequently provide the
fundamental mathematical background necessary for topological analysis
in Section 3. We provide a detailed discussion of our algorithm to
localize the coverage hole in Section 4, and describe its natural
adaptation to the problem of the worm hole attack in Section 5. We
conclude with some remarks in Section 6.

2 Formalization of topological network failures 
2.1 Coverage Problem 

We consider the scenario where $N$ sensor nodes are randomly deployed
in a region of interest. We denote the collection of all the nodes as
the set $V = \{v_i\}$. Each node $v_i$ can communicate with all the
nodes within a circular neighborhood $R_i^b$ of radius $r_b$, and we
denote these nodes as the set $N(v_i)$, the neighbors of $v_i$. A
communication graph $G = (V, E)$ is thus formed as the collection of
the set $V$ together with the set of edges $E = \{(v_i, v_j)\}$, where
$(v_i, v_j) \in E$ if and only if $v_i$ and $v_j$ can communicate with
each other. The coverage area of a sensor at each node is assumed to
be a circular neighborhood $R_i^c$ of radius $r_c$ centered at the
node $v_i$. Let $\Re$ denote the union of areas enclosed by the
outermost boundaries of connected components of the network. The
objective is to ensure that the following relation holds:

$$ \Re \subset \bigcup_i R_i^c = R_c \tag{1} $$

where $R_c$ is the total coverage space. This also highlights our
interest in $\Re$ being completely covered by the coverage areas of
the sensors. The outermost boundary of a sensor network is to some
extent under the control of the deployer, and there are algorithms
which can detect this boundary [33]. As Equation (1) suggests, we are
therefore mainly interested in the coverage of the region "inside" the
network. Furthermore, if relation (1) does not hold, our goal is to
find the nodes which are closest to the boundary
$\partial(\Re \setminus R_c)$ of the uncovered region. As an
illustration of this problem, Figure 11(a) shows a network with its
communication graph and coverage area (the shaded region). The region
of interest is the interior of the outermost boundary of the network.
Since a part of this region is not covered by any sensor, we seek the
smallest cycle in the network surrounding this coverage hole. We
assume the following:

1. Let $Q$ be a clique in $G$, then

$$ \mathrm{conv}(Q) \subset \bigcup_i R_i^c, \tag{2} $$

i.e., for any given clique $Q$ in the communication graph, the convex
hull ($\mathrm{conv}(Q)$) of the nodes is completely covered. This
assumption serves to characterize the coverage area using the
communication graph, as further discussed in Section 3. It can be
ensured by requiring a suitable relationship between the sensor
coverage radius and the communication radius. Note that this
assumption is not restrictive, as the antenna power, and hence the
communication radius, may be altered in order to extract the
appropriate graph. This may be seen in Figure 11(a), where for any
clique (for example, all the triangles), the interior is completely
covered.

2. The nodes have no localization information. 

3. There is no direction information, i.e., the nodes are unaware of the 
relative orientation of their neighbors. 
4. The nodes are not necessarily uniformly distributed in a given region
of interest.
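
The construction of the communication graph described above can be sketched in a few lines of Python. This is only an illustrative simulation under stated assumptions: the node coordinates, the radius value and the helper names `communication_graph` and `neighbors` are hypothetical and not part of the paper. Positions are used here solely to generate a test graph; the algorithms in the paper operate on the resulting graph alone and never see coordinates.

```python
import math
from itertools import combinations


def communication_graph(positions, radius):
    """Return (V, E) for nodes that can hear each other within `radius`.

    `positions` maps a node id to (x, y). Coordinates are an assumption of
    this simulation only; they stand in for the physical deployment.
    """
    nodes = sorted(positions)
    edges = set()
    for u, v in combinations(nodes, 2):
        if math.dist(positions[u], positions[v]) <= radius:
            edges.add((u, v))  # undirected edge, stored with u < v
    return nodes, edges


def neighbors(node, edges):
    """The neighbor set N(v) of a node in the communication graph."""
    return {b if a == node else a for a, b in edges if node in (a, b)}


# Hypothetical deployment: three nearby nodes and one isolated node.
positions = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.5, 0.8), 3: (3.0, 0.0)}
V, E = communication_graph(positions, radius=1.2)
print(sorted(E))
print(neighbors(0, E), neighbors(3, E))
```

Nodes 0, 1 and 2 form a clique (a triangle), which under assumption 1 means their convex hull is taken to be covered; node 3 has no neighbors.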

2.2 Worm Hole Problem 

A worm hole attack is typically launched by two colluding nodes at
positions $p_1$ and $p_2$ inside a network. Denote the neighborhood
regions around these points by $N_1$ and $N_2$. The two attacking
nodes may receive all the packets transmitted from within their
respective neighborhoods, and relay them to each other. Denote by
$V_1$ and $V_2$ the sets of vertices (sensor nodes) which lie in $N_1$
and $N_2$ respectively. The result of a worm hole attack is to produce
a complete bipartite graph with $V_1$ and $V_2$ as the two classes of
vertices. The problem of localizing a worm hole attack hence reduces
to identifying the sets $V_1$ and $V_2$. In addition to all the above
assumptions pertaining to the coverage problem, we also assume the
following:

1. The positions $p_1$ and $p_2$ are sufficiently far apart from each
other inside the network. This assumption is based on the expected
topology change due to a worm hole and its assumed detectability
(detection is possible only if $N_1 \cap N_2 = \emptyset$).

2. The positions $p_1$ and $p_2$ are not very close to the outer
boundary of the network.

3. The distribution of the nodes is sufficiently dense that the
deletion of a node in the network will not cause a significant change
in the path lengths. This assumption is not necessary for detecting a
worm hole, but is important for localizing the neighborhoods $N_1$ and
$N_2$.

Figure ?? shows an example of a worm hole attack. In this case, X and
Y are the positions $p_1$ and $p_2$, and the neighborhoods A and B are
$N_1$ and $N_2$ according to our definition. Note that in the network
shown, $N_1 \cap N_2 = \emptyset$, which will enable us to detect the
attack. Assumptions 2 and 3 are, however, not valid: $p_1$ and $p_2$
are close to the outermost boundary, which violates assumption 2, and
there is a bottleneck in the network, which violates assumption 3. The
algorithm presented in Section 5 will cause some false alarms in this
case. We further elaborate on this in Section 5 and show some examples
where we can accurately localize the attack.
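
The effect of a worm hole on the communication graph, as formalized above, can be illustrated with a small sketch. The helper names `add_wormhole` and `hop_distance`, and the six-node path network, are hypothetical choices made for illustration; the paper does not prescribe this code. The sketch only adds the complete bipartite set of edges between $V_1$ and $V_2$ and shows how hop distances collapse as a result.

```python
from collections import deque
from itertools import product


def add_wormhole(edges, v1, v2):
    """Return the edge set after joining every node of v1 to every node of v2."""
    if set(v1) & set(v2):
        raise ValueError("V1 and V2 must be disjoint")
    tunnel = {tuple(sorted(pair)) for pair in product(v1, v2)}
    return set(edges) | tunnel


def hop_distance(edges, src, dst):
    """Breadth-first hop count between two nodes, or None if disconnected."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist, queue = {src: 0}, deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            return dist[u]
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


# Hypothetical network: a path 0-1-2-3-4-5 with attackers near both ends.
edges = {(i, i + 1) for i in range(5)}
attacked = add_wormhole(edges, v1={0, 1}, v2={4, 5})

print(hop_distance(edges, 0, 5))     # hop count before the attack
print(hop_distance(attacked, 0, 5))  # hop count after the attack
```

In this toy network the tunnel adds four edges, and nodes 0 and 5 become direct neighbors, which is the kind of topology change the detection relies on.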
3 Framework

The current state of research draws on areas such as topology and
homological algebra from mathematics, as well as engineering and
computer science (e.g., gossip algorithms in sensor networks and graph
theory). In this section, we construct a suitable framework for our
algorithm by introducing the necessary mathematical and computational
tools. While the available literature is extensive, we focus only on
the important concepts which are central and sufficient to elucidate
the implications of our algorithm.

3.1 Topological Analysis 

Topological analysis [25] can loosely be construed as the study of the
global organization of spaces without paying much heed to fine
geometrical structure. For a space embedded in $\mathbb{R}^n$, this
amounts to analyzing properties such as "is the space connected?",
"does the space wrap upon itself?", "does the surface have any holes?"
or "does the surface enclose a three dimensional void?" and so on. As
such, the developed tools provide the proper generalization to study
organizational features of a network, without expending resources on
finer details. This generality is concisely captured in the notions of
homotopic mappings and homotopy equivalent spaces, which are defined
as follows:

Definition Let $X$ and $Y$ be two spaces. Two maps
$f_1, f_2 : X \to Y$ are said to be homotopic ($f_1 \simeq f_2$) to
each other if there exists a continuous map $F : X \times I \to Y$
(where $I = [0,1]$) such that $F(s,0) = f_1(s)$ and
$F(s,1) = f_2(s)$. Such a function $F$ is called a homotopy between
$f_1$ and $f_2$.

Definition Two spaces $X$ and $Y$ are said to be homotopy equivalent
if there exist continuous maps $f : X \to Y$ and $g : Y \to X$ such
that $f \circ g \simeq \mathrm{id}_Y$ and
$g \circ f \simeq \mathrm{id}_X$. Such a map $f$ is called a homotopy
equivalence.

The above definition means that if two spaces $X$ and $Y$ are homotopy
equivalent, then one can be continuously deformed into the other, and
they both have the same topological features. A remarkable result from
algebraic topology is that the homology spaces which we compute
(described in Section 3.3) are invariant to homotopic mappings. This
is what enables us to treat a relatively large class of spaces in a
unified framework without the costly and valuable resources required
for considering their exact geometry. The computation of homology
spaces does not depend on the localization information of nodes.
Figure 2 shows two homotopy equivalent spaces which may be viewed as
coverage areas of two different sensor networks. Although their
geometry (the distribution of distances between points) is quite
different (this also reflects the location of the nodes), they have
the same topological features. We can further exploit this invariance
to get homotopy equivalent representations of these spaces, using
simple building blocks which, as will be seen, are simple to
manipulate. These representations are called simplicial complexes, and
the building blocks are called simplices (simple pieces). The
dimension of a simplex is represented by its order. A standard
0-simplex is just a point, a standard 1-simplex is a line segment, a
2-simplex a triangle and so on. A $k^{th}$ order simplex or
$k$-simplex $\sigma^k$ is the set of all points given by the convex
combinations of $k+1$ affinely independent points,
$\sigma^k = (v_0, \ldots, v_k)$. Figure 3 shows simplices of order 0
through 3, and Figure 4 shows an example of representing a topological
space using a simplicial complex. Note that the topology is preserved
in the simplicial complex representation.

Figure 2: Two homotopy equivalent spaces

3.2 Combinatorics

Owing to its simple representation, a simplicial complex may be
abstracted into a combinatorial object. We can view this intermediate
step as a transition from topological spaces into algebra for
computing homology spaces. As each simplex is uniquely determined by
specifying its vertices, given the total set of vertices
$V = \{v_i\}$, a simplicial complex may be abstractly specified as a
collection of subsets $E_j \subset P(V)$, $j = 1, 2, \ldots$, where
$P(V)$ is the power set, and each $E_j$ is a collection of $j$-tuples
from $V$ representing the $(j-1)$-simplices. Note that when we
restrict $j$ to the set $\{1, 2\}$, what we get is a graph. Therefore,
a simplicial complex may be viewed as a generalization of a graph, and
when it is homotopy equivalent to a space, it captures its topological
properties. For a given $k$-simplex $(v_0, \ldots, v_k)$, we can also
define an orientation by specifying the order of the vertices. We
divide all possible permutations of these $k+1$ points into two
classes; the elements of each class may be transformed from one to
another by interchanging adjacent vertices an even number of times,
giving a simplex two possible orientations. The simplices in Figures 3
and 4 show an orientation given to the simplices.

Figure 3: Simplices of order 0 to 3. (a) 0-simplex, (b) 1-simplex,
(c) 2-simplex, (d) 3-simplex

To get a representation of a coverage area, we can use a particular
type of simplicial complex called the nerve complex [H]. Note that the
coverage area is a union of convex sets. Given the collection of sets
$\{R_i^c\}$ with union $R_c = \bigcup_i R_i^c$, the nerve complex (or
the Čech complex) of $R_c$, $K_N(R_c)$, is the abstract simplicial
complex whose $k$-simplices correspond to nonempty intersections of
$k+1$ distinct elements of the collection. An edge in $K_N(R_c)$
exists between two vertices if and only if the corresponding elements
intersect. Higher dimensional simplices are determined by mutual
intersections of collections of elements. Among the many uses of
nerves in topology, the following classical result is perhaps of
greatest importance in applications:

Theorem 3.1 (The Čech Theorem): The nerve complex of a collection of
convex sets has the homotopy type of the union of the sets.



Figure 4: Representation of a topological space using a simplicial
complex. (a) A topological space. (b) Simplicial complex representing
the topological space.

The implication of this theorem is that $K_N(R_c)$ effectively
captures the topology of $R_c$. The computation of the nerve complex
unfortunately requires localization information, and is very difficult
even when that information is available. We therefore rely on an
approximate representation called the Vietoris-Rips (or Rips for
short) complex, denoted by $K_p(R_c)$, which can be obtained from the
communication graph alone. To extract the Rips complex, we simply
declare each $k$-clique in the communication graph to be a
$(k-1)$-simplex in the Rips complex. Under assumption (1) in
Section 2.1, the Rips complex is a reasonable approximation of the
underlying topological space in the sense that the numbers of false
alarms and false negatives indicating coverage holes are very low. The
reader is referred to [13] for examples where the Rips complex does
not accurately represent the coverage area. We maintain, however, that
this is not a limitation: as shown in [H], we can always represent
$R_c$ accurately using two Rips complexes with appropriate
communication radii. In particular, for a coverage radius $r_c$, the
authors show that, if we have communication radii (strong and weak)
$r_1$ and $r_2$ such that $2r_c = r_1 > \sqrt{2}\, r_2$, then the Rips
complexes $K_p^{r_1}(R_c)$ and $K_p^{r_2}(R_c)$ satisfy the following
relation:

$$ K_p^{r_2}(R_c) \subset K_N(R_c) \subset K_p^{r_1}(R_c) \tag{3} $$

The above relation implies that the topology of $R_c$ is completely
captured by the two Rips complexes. This is tantamount to using our
algorithm twice. For the purpose of this paper, we will assume that
the Rips complex obtained under condition (1) in Section 2.1
accurately represents the coverage area. In the following section we
describe an approach to infer the topological properties of $R_c$
using algebra on $K_p(R_c)$.
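
The rule "each $k$-clique of the communication graph is a $(k-1)$-simplex" can be sketched directly. The following is a minimal brute-force illustration, assuming a small graph held in memory at a single place; the function name `rips_complex` and the example graph are hypothetical, and a real deployment would need a distributed and far more economical clique search.

```python
from itertools import combinations


def rips_complex(nodes, edges, max_dim=2):
    """List the simplices of the Rips complex up to dimension `max_dim`.

    A set of dim + 1 nodes is a simplex when every pair among them is an
    edge of the communication graph, i.e. when the nodes form a clique.
    """
    edge_set = {frozenset(e) for e in edges}
    ordered = sorted(nodes)
    simplices = {0: [(v,) for v in ordered]}
    for dim in range(1, max_dim + 1):
        simplices[dim] = [
            cand
            for cand in combinations(ordered, dim + 1)
            if all(frozenset(pair) in edge_set for pair in combinations(cand, 2))
        ]
    return simplices


# Hypothetical graph: a triangle 0-1-2 with a pendant edge 2-3.
nodes = [0, 1, 2, 3]
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
K = rips_complex(nodes, edges)
print(K[1])  # 1-simplices (edges)
print(K[2])  # 2-simplices (triangles)
```

Only the vertices, edges and triangles are needed for the first homology computation discussed next, which is why `max_dim` defaults to 2 in this sketch.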

3.3 Homological Algebra 

In this section, we discuss some fundamental notions of homology
spaces. We subsequently relate the algebraic structure to the
combinatorial structure of the Rips complex, and demonstrate the
usefulness of these spaces in inferring the existence and the
cardinality of coverage holes.

Definition A chain complex $\{C_k, \partial_k\}$ is a sequence of
vector spaces $\{C_k\}$ together with linear operators
$\{\partial_k : C_k \to C_{k-1}\}$ called the boundary operators,

$$ \cdots \xrightarrow{\partial_{k+1}} C_k \xrightarrow{\partial_k} C_{k-1} \xrightarrow{\partial_{k-1}} \cdots \xrightarrow{\partial_1} C_0 \xrightarrow{\partial_0} 0 $$

with the boundary operators satisfying

$$ \partial_{k-1} \circ \partial_k = 0 \quad \text{or} \quad \partial^2 = 0 \tag{4} $$

The spaces $\{C_k\}$ are called chain spaces and their elements are
called chains.

The chain complex is fundamental to homological algebra, as it
provides the structure in which homology spaces may be defined. Note
that since $\partial_{k-1} \circ \partial_k = 0$, it follows that the
image of one boundary operator is a subset of the kernel (or null
space) of the next boundary operator, i.e.,

$$ \mathrm{Img}(\partial_k) \subset \mathrm{Ker}(\partial_{k-1}) $$

This observation enables us to define a homology space as follows.

Definition Given a chain complex $C = \{C_k, \partial_k\}$, the
$k^{th}$ homology group $H_k(C)$ of the chain complex is given as

$$ H_k(C) = \ker(\partial_k) / \mathrm{Img}(\partial_{k+1}) \tag{5} $$

i.e., the $k^{th}$ homology group is the quotient group formed by
equivalence classes of elements in $\ker(\partial_k)$, where two
elements are considered equivalent if their difference lies in the
subspace $\mathrm{Img}(\partial_{k+1})$.

Of particular interest to us is the first homology space $H_1(C)$. We
will hence work only with $C_2$, $C_1$ and $C_0$. We form the chain
spaces $C_2$, $C_1$ and $C_0$ by taking all the 2-simplices
(triangles), 1-simplices (edges) and vertices, respectively, of the
Rips complex $K_p(R_c)$ as the basis vectors. The additive inverses in
the chain spaces are given in terms of the orientation as:

$$ \text{if } \sigma^k = (v_0, \ldots, v_i, v_{i+1}, \ldots, v_k) \text{ then } -\sigma^k = (v_0, \ldots, v_{i+1}, v_i, \ldots, v_k) \tag{6} $$

and the boundary operator is defined on $k$-dimensional simplices as:

$$ \partial_k(v_0, \ldots, v_k) = \sum_{i=0}^{k} (-1)^i (v_0, \ldots, v_{i-1}, v_{i+1}, \ldots, v_k) \tag{7} $$

It is simple to check that the boundary operator so defined satisfies
Equation (4), and we illustrate this fact here by considering, for
example, the action of $\partial_1 \circ \partial_2$ on a 2-simplex
$(v_0, v_1, v_2)$:

$$ \partial_1 \circ \partial_2 (v_0, v_1, v_2) = \partial_1\big((v_1, v_2) - (v_0, v_2) + (v_0, v_1)\big) = v_2 - v_1 - v_2 + v_0 + v_1 - v_0 = 0 \tag{8} $$

Using the above definitions of boundary operators and chain spaces, we
can form a chain complex $C(R_c)$ using the combinatorial structure in
the Rips complex.
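
The boundary operator of Equation (7) and the identity of Equation (4) can be checked mechanically. The sketch below is one possible encoding, not the paper's implementation: a chain is stored as a dictionary from oriented simplices (tuples of vertex labels) to integer coefficients, and the names `boundary` and `boundary_of_chain` are hypothetical.

```python
def boundary(simplex):
    """Boundary of one oriented simplex, following Equation (7).

    The simplex is a tuple of vertex labels; the result is a chain stored
    as {face: integer coefficient}.
    """
    if len(simplex) <= 1:
        return {}  # the boundary of a vertex is zero
    chain = {}
    for i in range(len(simplex)):
        face = simplex[:i] + simplex[i + 1:]  # drop the i-th vertex
        chain[face] = chain.get(face, 0) + (-1) ** i
    return chain


def boundary_of_chain(chain):
    """Extend `boundary` linearly to a chain {simplex: coefficient}."""
    total = {}
    for simplex, coeff in chain.items():
        for face, sign in boundary(simplex).items():
            total[face] = total.get(face, 0) + coeff * sign
    # Drop faces whose coefficients cancelled.
    return {face: c for face, c in total.items() if c != 0}


triangle = ("v0", "v1", "v2")
d2 = boundary(triangle)        # edges of the triangle with signs
d1_d2 = boundary_of_chain(d2)  # boundary of the boundary
print(d2)
print(d1_d2)
```

The first printed chain reproduces the three signed edges inside the parentheses of Equation (8), and the second is the empty chain, i.e. zero.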

In order to understand what homology groups tell us about the
topological space, we should look carefully at the action of the
boundary operators. Let us look at the null space (kernel) of
$\partial_1$. Consider a cycle $c = e_1 + e_2 + e_3 + e_4$ as shown in
Figure 5, which is homotopic to a loop. The action of $\partial_1$ is
given as:

$$ \partial_1(c) = v_2 - v_1 + v_3 - v_2 + v_4 - v_3 + v_1 - v_4 = 0 $$

This implies that the null space of $\partial_1$ consists of all
closed cycles (chains without boundaries). As we saw in Equation (8),
the boundaries of $(k+1)$-simplices are closed cycles in $C_k$, and
they belong to $\ker(\partial_k)$. This means that $\ker(\partial_1)$
also contains the closed cycles which are boundaries of 2-simplices.
But we know that 2-simplices are homeomorphic to disks, i.e., to
spaces without any holes in them. Therefore, if we remove all the
cycles which are boundaries of 2-simplices, the cycles that remain are
those circling a hole. From the definition of the homology group
$H_1(C(R_c)) = \ker(\partial_1) / \mathrm{Img}(\partial_2)$, it is
clear that the dimension of $H_1$ counts the number of holes in our
topological space. We now present an example to illustrate the basic
mechanism of this procedure.

Figure 5: A chain in $\ker(\partial_1)$

3.3.1 Example 



Consider the simplicial complex X shown in Figure 4(b) The orientation 
of the simplices (1 and 2-dimensional) are arbitrarily chosen; the homology 
assigned spaces will be independent of this choice. Consider the 1-chains 
(paths): Ci, the outermost boundary, C2, the closed path enclosing the trian- 
gles and C3, the closed path enclosing the hole, all in clockwise orientations. 
These chains in terms of the basis vectors (the simplices) are expressed as 

ci=El-E6 + E12 + Ell + ElO + E9 + E2, 

C2 = E1-E6-E7 + E8 + E9-E2, C3 = E12 + Ell + ElO - E8 + E7. 

Note that Equation ([2D states that a change in the sign of a simplex changes 
its orientation. Using Equation ([7]) for the boundary operator, we can see 
that 

^1 (cs) = di {E12) + di{Ell) + di (ElO) - di {E8) + di {E7) = V7-VQ + V8- 
V7 + V5-V8~ {V5 -VA) + V6-V4: = 

Similarly, any closed path (including $c_1$ and $c_2$) can be shown to belong to
$\ker(\partial_1)$. Again using Equation (7), the action of the $\partial_2$ operator on $T_1$, for
example, is given by

$$\partial_2(T_1) = E_6 + E_5 + E_7.$$

It should be easy to verify that $c_2$ can be expressed as
$c_2 = \partial_2(T_4 - T_3 + T_2 - T_1)$ and therefore $c_2 \in \mathrm{img}(\partial_2)$. Note also that
$c_1 - c_3 = c_2$, i.e., $c_1$ and $c_3$ differ by a chain in $\mathrm{img}(\partial_2)$ and are
therefore homologous. In other words, they encircle the same hole. To
compute $H_1(X)$, first observe that any closed path on $X$ may be expressed
as a sum of the closed paths surrounding the four triangles and the closed
path surrounding the hole. Therefore, $\ker(\partial_1)$ is a vector space with 5
basis vectors, i.e., $\ker(\partial_1) = \mathbb{R}^5$. Also, the four closed paths generated
by the action of $\partial_2$ on the four triangles (the basis vectors for $C_2$) are
linearly independent and therefore $\mathrm{img}(\partial_2) = \mathbb{R}^4$. From the definition
of a homology space as a quotient space, we can see that $H_1(X) = \mathbb{R}$. The
first homology group has one generator, corresponding to one hole in the
complex.
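
The rank computation in this example can be sketched in a few lines. The
snippet below is only an illustration under stated assumptions: it uses a
small hypothetical complex (not the one in Figure 4(b)), stores each edge as
a pair (u, v) with u < v and each triangle as a triple (a, b, c) with
a < b < c, and computes the first Betti number as
dim ker(d1) - rank(d2) by exact Gaussian elimination. The function names are
ours, not the paper's.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of an integer matrix by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            if factor:
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def betti_1(n_vertices, edges, triangles):
    """First Betti number: dim ker(d1) - rank(d2)."""
    index = {e: k for k, e in enumerate(edges)}
    # d1: column of edge (u, v) has -1 at u and +1 at v.
    d1 = [[0] * len(edges) for _ in range(n_vertices)]
    for k, (u, v) in enumerate(edges):
        d1[u][k], d1[v][k] = -1, 1
    # d2: boundary of triangle (a, b, c) is (b, c) - (a, c) + (a, b).
    d2 = [[0] * len(triangles) for _ in edges]
    for k, (a, b, c) in enumerate(triangles):
        d2[index[(b, c)]][k] = 1
        d2[index[(a, c)]][k] = -1
        d2[index[(a, b)]][k] = 1
    return (len(edges) - rank(d1)) - rank(d2)


# Hypothetical complex: two triangles sharing edge (1, 2); only one is filled.
EDGES = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
print(betti_1(4, EDGES, [(0, 1, 2)]))
```

In the example of the text the same count reads 5 - 4 = 1; in the sketch,
filling the second triangle as well removes the hole.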

3.3.2 Laplacians 

As we saw in the above example, the computation of the dimension of $H_1$
(the first Betti number) involves computing the ranks of the operators $\partial_1$ and
$\partial_2$. Such a task is computationally very expensive and, as we will show in
Section 4.2, the precise rank of these operators is not necessary for detecting
the existence of a hole. Laplacian operators provide an easier way to detect
the triviality of the homology spaces. The graph Laplacian from graph theory
may be generalized to the case of simplicial complexes [22] as follows.

Definition Given a chain complex $C$, the $k$-th Laplacian operator
$L_k : C_k \to C_k$ is defined as

$$L_k = \partial_{k+1} \circ \partial_{k+1}^{*} + \partial_k^{*} \circ \partial_k,$$
where $\partial_k^{*}$ is the adjoint of $\partial_k$.

It may be shown [22] that the kernel of the Laplacian operator is isomorphic
to the $k$-th homology group, i.e., $\ker(L_k) \cong H_k(C)$, and we can use the
Laplacian operators to infer the topology of $K_p(R_c)$. An important
property of the Laplacian operators is that they are symmetric and non-negative
definite.
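
As an illustration of the definition, the sketch below assembles the matrix
of $L_1$ for a tiny complex. It assumes the standard inner product on the
chain spaces, so that the adjoint is simply the matrix transpose; the helper
names and the example complex are hypothetical and chosen only for
illustration.

```python
def boundary_1(n_vertices, edges):
    """Matrix of d1: column of edge (u, v) has -1 at u and +1 at v."""
    d1 = [[0] * len(edges) for _ in range(n_vertices)]
    for k, (u, v) in enumerate(edges):
        d1[u][k], d1[v][k] = -1, 1
    return d1


def boundary_2(edges, triangles):
    """Matrix of d2: boundary of (a, b, c) is (b, c) - (a, c) + (a, b)."""
    index = {e: k for k, e in enumerate(edges)}
    d2 = [[0] * len(triangles) for _ in edges]
    for k, (a, b, c) in enumerate(triangles):
        d2[index[(b, c)]][k] = 1
        d2[index[(a, c)]][k] = -1
        d2[index[(a, b)]][k] = 1
    return d2


def first_laplacian(d1, d2):
    """L1 = d2 d2^T + d1^T d1, assuming the adjoint is the transpose."""
    m = len(d2)  # number of edges
    return [[sum(d2[i][t] * d2[j][t] for t in range(len(d2[i])))
             + sum(row[i] * row[j] for row in d1)
             for j in range(m)] for i in range(m)]


def apply(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


# Hollow triangle: one hole, so the cycle (0,1) - (0,2) + (1,2) lies in ker(L1).
TRI_EDGES = [(0, 1), (0, 2), (1, 2)]
L_hollow = first_laplacian(boundary_1(3, TRI_EDGES), boundary_2(TRI_EDGES, []))
print(apply(L_hollow, [1, -1, 1]))
```

Filling the triangle adds the term $\partial_2 \partial_2^{*}$ and, in this small case,
makes $L_1$ non-singular, in line with $\ker(L_1) \cong H_1$.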
3.4 Distributed Computation



In this section, we address issues central to sensor networks, chief among
them the scaling of computation with the network size and the implementation
of the related mathematical tools.

Owing to the excessive cost of communication between nodes, gathering all
the raw data from the nodes at a sink node is prohibitive. Whenever possible,
distributed algorithms should be designed to reduce the demand for data
collection. Moreover, the power consumed during communication is higher
than that required for computation within the nodes, which highlights the
importance of algorithms that replace communication with in-node
computation. The use of positioning systems such as GPS, or of other
localization algorithms, is also very expensive, so the use of localization
information should be avoided if at all possible. The algorithm we propose
here satisfies all these basic requirements.

An interesting class of distributed algorithms is that of gossip algorithms,
where nodes process the data by passing messages among their neighbors.
One particular gossip algorithm we exploit extensively is the distributed
computation of the eigenvalues of a sparse matrix in a network [18] using the
orthogonal (or power) iteration method. In particular, as described in Section
4.2, we wish to compute the spectral radius of the first-order Laplacian $L_1$
of the Rips complex $K_p(R_c)$. We can extract the Rips complex from the
communication graph, and distributively compute $L_1$ and its spectral radius as
described in [23]. We also develop some simple gossip algorithms and prove
their efficacy in the following sections as and when required.

4 Coverage Hole Localization 
4.1 Algorithm Overview 

In this section, we present a novel method that reduces the problem of
locating a coverage hole in a network to one of detecting a hole, using a
"divide and conquer" method. As seen in the previous section, the problem of
detecting a hole in $R_c$ reduces to checking whether the first homology space
of the chain complex formed from $K_p(R_c)$ is trivial, i.e., if $H_1(K_p) = 0$,
there are no holes in the space. We successively and strategically divide the
network into smaller partitions, check for the presence of holes in each of
these partitions, and "drop" the partitions where there are none. As we
continue this process, the partitions which survive are the boundaries of the
holes. The crux of this algorithm lies in dividing the network in such a way
as to preserve the topology, i.e., to neither create nor destroy holes.
Table 1 presents an overview of this algorithm.

for all partitions with non-trivial homology
    \\ Dissection
    step 1: Find the diameter nodes.
    step 2: Find the boundary nodes and construct the partitions.
    \\ Detect holes in each of the above two partitions
    for the two partitions constructed
        step 3: Compute the Laplacian matrix L.
        step 4: Check the rank deficiency of L.
repeat.

Table 1: Hole Localization Overview



4.2 Hole Detection 

In order to determine the number of holes in $R_c$, we have to compute the
dimension of $\ker(L_1)$. If, on the other hand, only the detection of a hole
is of interest, we may check whether $L_1$ is rank deficient or, in other
words, whether $L_1$ has a zero eigenvalue. To that end, we use the following
theorem:

Theorem 4.1 Let $L_1$ be a symmetric non-negative definite matrix with
spectrum $\sigma(L_1)$ and spectral radius $\rho(L_1)$. Then $L_1$ is rank deficient if
$\rho(\rho(L_1)I - L_1) = \rho(L_1)$.

Proof Let $x$ be an eigenvector corresponding to the eigenvalue $\lambda \in \sigma(L_1)$.
Then $(\rho(L_1)I - L_1)x = (\rho(L_1) - \lambda)x$ $\Rightarrow$ $x$ is also an eigenvector of
$(\rho(L_1)I - L_1)$ and its eigenvalue is $\rho(L_1) - \lambda$. Furthermore, $L_1$ is
non-negative definite $\Rightarrow \lambda \geq 0 \Rightarrow \rho(\rho(L_1)I - L_1) \leq \rho(L_1)$. If $L_1$ is of
full rank, then $\lambda > 0 \Rightarrow \rho(\rho(L_1)I - L_1) < \rho(L_1)$, so equality can hold
only when $L_1$ has a zero eigenvalue. $\blacksquare$

The spectral radius of $L_1$ can be computed using the power iteration
method, which searches for the largest eigenvalue and can be carried out
distributively over the network [18]. The convergence of the power iteration
method (for the eigenvector) is slow when the difference between the largest
and second largest eigenvalues is small, while the eigenvalue itself quickly
converges to the true value. A false detection of a hole is possible when the
smallest eigenvalue is very close to zero, but this problem is unlikely to
recur in successive partitions (partitions are explained in the next section).
Each iteration of the power iteration method consists of multiplying $L_1$ by
the vector from the previous iteration and normalizing the resulting vector.
The sum of the squared elements of the vector (for normalization) can also be
computed distributively in the network by a gossip algorithm whose convergence
time is of the order $n\log(n)$ [3] [2].
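
The test of Theorem 4.1 can be sketched centrally as follows. This is a
minimal illustration, not the distributed implementation of [18]: the matrix
is held in one place, the start vector, iteration count and tolerance are
arbitrary assumptions, and the function names are ours.

```python
import math


def mat_vec(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]


def spectral_radius(m, iters=200):
    """Power iteration for a symmetric non-negative definite matrix."""
    x = [1.0 + 0.1 * i for i in range(len(m))]  # arbitrary non-zero start
    for _ in range(iters):
        y = mat_vec(m, x)
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:  # x lies in the kernel; only happens for m = 0 here
            return 0.0
        x = [v / norm for v in y]
    # Rayleigh quotient of the (normalized) final iterate.
    return sum(a * b for a, b in zip(x, mat_vec(m, x)))


def has_zero_eigenvalue(m, tol=1e-6):
    """Rank-deficiency test: rho(rho(M) I - M) == rho(M)."""
    rho = spectral_radius(m)
    shifted = [[(rho if i == j else 0.0) - m[i][j] for j in range(len(m))]
               for i in range(len(m))]
    return abs(spectral_radius(shifted) - rho) <= tol * max(rho, 1.0)


# L1 of a hollow triangle (one hole) versus a full-rank diagonal matrix.
HOLLOW = [[2, 1, -1], [1, 2, 1], [-1, 1, 2]]
print(has_zero_eigenvalue(HOLLOW), has_zero_eigenvalue([[2, 0], [0, 1]]))
```

The tolerance plays the role discussed above: a smallest eigenvalue that is
positive but below the tolerance would be reported as a hole.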

4.3 Hole Localization 

Each element in the first homology space $H_1$ represents an equivalence class
of homologous closed paths encircling a hole in the coverage space. As such,
localizing the exact boundary of this hole is in essence a problem of finding
the smallest closed path in an equivalence class. A very direct approach
was proposed in [30], where the authors formulate the localization as an
optimization problem that seeks the sparsest chain in the $H_1$ space. Such an
approach is effective, but at the cost of very slow convergence, and it
requires all the nodes in the network to participate in the optimization.
While the presence of holes in a coverage space is a global property, the
boundary of a hole is confined to a relatively small part of the network. The
ability to detect a hole in this region in no way depends on the
configuration of nodes in other parts of the network. We exploit this idea to
reduce the problem of identifying the boundary of a hole to the much simpler
problem of detecting holes.

We accomplish this reformulation by iteratively dissecting the network into
two smaller partitions and detecting the presence of holes in these smaller
partitions. All nodes in a partition where no hole is detected go into a
"sleep" mode and are taken out of the analysis, yielding a valuable power
saving. The remaining active nodes form partitions with non-trivial
homology (with holes in coverage), which are further dissected in pursuit of
hole localization. We thus rapidly converge onto the exact boundary of
the holes with each iteration. In the first iteration, each connected component
of the network graph $G$ is treated as a partition. The partitioning strategy
is to minimize the "size" of the resulting partitions while simultaneously
preserving the overall topology.

4.3.1 Finding Diameter Nodes

Firstly, we elaborate on what we mean by "size" in the above description.
The time required to complete steps 1 and 2 above directly depends on the
diameter of the network partition. Step 4 utilizes a gossip algorithm whose
convergence also depends on the diameter, with other factors possibly coming
into play. A network segmentation obtained by minimizing the diameter
of the smaller partitions is therefore optimal for minimizing the overall run
time. This is facilitated by identifying a pair of nodes, called the diameter
nodes, defined by

$$(u, v) = \arg\max_{v_i, v_j} d(v_i, v_j), \qquad (9)$$

where $d(v_i, v_j)$ is the length of the shortest path between nodes $v_i$ and
$v_j$ (in terms of hop count) in the partition of interest. Such a pair will
generally not be unique, and "ties" between nodes are broken by a simple
protocol which chooses the pair that has the node with the smallest ID.

We determine the diameter nodes in two stages. We first find the candidate
nodes $C_{dia}$ by assigning to each node $v_i$ in the current partition a scalar
field $f(v_i)$ equal to its distance to the farthest node, and selecting the
nodes with the maximum $f$; we subsequently break the "ties" using the
aforementioned criterion:

$$f(v_i) = \max_{v_j} d(v_i, v_j),$$
$$C_{dia} = \{v \mid f(v) = \max_{v_j} f(v_j)\}. \qquad (10)$$

To compute $f$ on $G$, we use a simplified version of Dijkstra's algorithm.
The simplification results from the following differences with Dijkstra's
algorithm: a) we do not need the shortest paths, only the distances, and
b) instead of the shortest distance from a node $v_i$ to all other nodes, we
require only the maximum of these distances.
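
Equations (9) and (10) can be illustrated with a centralized sketch that
computes hop distances by breadth-first search. It assumes a connected
partition given as an adjacency dictionary with comparable node IDs, and it
reads the tie-breaking rule as "take the candidate with the smallest ID,
then the smallest-ID candidate at maximal distance from it"; both the
function names and this reading of the rule are our assumptions.

```python
from collections import deque


def hop_distances(adj, source):
    """Hop count from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def scalar_field(adj):
    """f(v) = largest hop distance from v to any other node, as in (10)."""
    return {v: max(hop_distances(adj, v).values()) for v in adj}


def diameter_nodes(adj):
    """Pick one pair of diameter nodes, breaking ties by smallest ID."""
    f = scalar_field(adj)
    f_max = max(f.values())
    candidates = sorted(v for v in adj if f[v] == f_max)
    u = candidates[0]
    dist = hop_distances(adj, u)
    v = min(w for w in candidates if dist[w] == f_max)
    return u, v


# Path graph 0 - 1 - 2 - 3 - 4.
PATH = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(scalar_field(PATH), diameter_nodes(PATH))
```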

4.3.2 Computing the scalar field $f$

A summary of the algorithm for computing $f$ is given in Table 2. In what
follows, we provide an intuition into the mechanics of the algorithm, followed
by a mathematical justification.

It immediately follows from Equation (10) that, in order to compute $f(v_i)$,
it is sufficient for each node $v_i$ to know $d(v_i, v_j)$ for all $v_j$
in $G$. Since $f(v_i)$ has to be computed for all $v_i$, the preceding statement
may equivalently be stated as: for each $v_i$, it is sufficient for all $v_j$ (all
other nodes) to know $d(v_i, v_j)$. We accomplish this by broadcasting node
$v_i$'s ID in the network; for each node $v_j$, $d(v_i, v_j)$ is then equal to the
number of hops taken by the first message arriving at $v_j$. Note that, in
order for $v_i$'s ID to reach all other nodes (assuming a connected graph), it
is sufficient that every other node broadcasts this information to its
neighbors only once, since re-broadcasting provides no new information. This
reduces the number of required broadcasts. In order to ensure that no
message (ID) is re-broadcast, it is sufficient for each node to remember all
the messages it previously transmitted (for example, by maintaining a table).
The next theorem relaxes this requirement by showing that it is sufficient
for a node to remember each message only for a limited time. This reduces
the memory requirement on the nodes. We now provide a mathematical
justification for the above intuitive arguments.

Denote by $A = \{a_{ij}\}$ the adjacency matrix of $G$; then $A^n = \{a^n_{ij}\}$, $n > 0$,
where $a^n_{ij}$ is the number of paths of length $n$ from $i$ to $j$. For
simplicity, we assume that $i$ is the ID given to node $v_i$. Now, the shortest
distance from $v_i$ to $v_j$, $i \neq j$, is given by

$$d(v_i, v_j) = \arg\min_{n > 0} \{a^n_{ij} > 0\}, \quad i \neq j. \qquad (11)$$

The matrix $A^n$ can be distributively represented in the network, where node
$v_i$ computes and stores the $i$-th row. This can be iteratively computed as
$A^{n+1} = A \cdot A^n$, where $a^{n+1}_{ij}$ is obtained as
$a^{n+1}_{ij} = \sum_{v_k \in N(v_i)} a^n_{kj}$.

This computation is enabled by all the nodes broadcasting their row to their
neighbors. If $m$ is the smallest integer such that $a^m_{ij} > 0$, this implies
there is no path from $v_i$ to $v_j$ of length smaller than $m$. Therefore, node
$i$ "discovers" node $j$ at iteration $m$ and, at this instant, $j$ is a "new"
node. Further, if $k \in N(i)$, this also implies $a^{m+1}_{kj} > 0$, and the values
of $a^n_{ij}$ for $n \geq m + 1$ are irrelevant from the perspective of computing
$d(v_k, v_j)$. We therefore refrain from broadcasting $a^n_{ij}$ for $n > m$ time
intervals. In other words, each node broadcasts the information about a new
node it discovers only once. Note that in so doing, we do not actually compute
$A^n$ at the $n$-th iteration, but an estimate $\hat{A}^n$ with the property that
the smallest integer $m$ for which $\hat{a}^m_{ij} > 0$ is that for which
$a^m_{ij} > 0$. The table acts as a reference to avoid transmitting duplicate
information to neighbors. Here, it appears that the memory required at each
node will be equal to the number of nodes in the partition, as all the nodes
will eventually be discovered. We maintain that it suffices to store a node
in the table for only two iterations.

Theorem 4.2 A node $v_i$ storing the information about the node $v_j$ for two
iterations guarantees that no duplicate information is broadcast.

Proof By contradiction.

Duplicate information will be broadcast if a node $v_i$ discovers a node $v_j$ at
iterations $m$ and $m + t$, $t > 2$. This means that there are two paths
$P = (j, p_1, \ldots, p_{m-1}, i)$ and $Q = (j, q_1, \ldots, q_{m+t-1}, i)$. At the $m$-th
iteration, node $v_i$ will start a broadcast which propagates along $Q$ in the
reverse direction and meets the message coming along $Q$ at node
$q_{m+t_1} = q_{m+t-t_1}$, where $t_1 = (t - 1)/2$ or $(t - 2)/2$, whichever is an
integer, such that $(m + t - t_1) - (m + t_1) = 1$ or $2$. The message from $v_j$
would take the path $Q$ only if node $q_{m+t_1}$ broadcast it, thus violating the
rule because $v_j$ is already in the table at that instant. $\blacksquare$

If $n_{max}$ is the largest distance for which a node $v_i$ discovers a new node,
then we set $f(v_i) = n_{max}$.
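
The flooding scheme and the two-iteration memory rule can be checked on small
graphs with the following simulation. It is a sketch under simplifying
assumptions: rounds are perfectly synchronous, messages are never lost, and
the partition is connected. The function name and the bookkeeping (a log of
every ID a node sends) are ours, added only so that duplicates can be
counted.

```python
def flood_scalar_field(adj):
    """Simulate the synchronous ID flood; return (f, IDs sent by each node)."""
    table = {v: {v: 0} for v in adj}   # ID -> iteration at which it was found
    outbox = {v: {v} for v in adj}     # IDs each node broadcasts next
    sent = {v: [] for v in adj}        # broadcast log, to detect duplicates
    f = {v: 0 for v in adj}
    n = 0
    while any(outbox.values()):
        n += 1
        inbox = {v: set() for v in adj}
        for v, ids in outbox.items():
            sent[v].extend(ids)
            for w in adj[v]:
                inbox[w] |= ids
        outbox = {}
        for v in adj:
            new = inbox[v] - set(table[v])   # check against the table first
            for j in new:
                table[v][j] = n
            if new:
                f[v] = n                     # last iteration with a discovery
            # Forget every ID discovered at iteration n - 2 or earlier.
            table[v] = {j: it for j, it in table[v].items() if it > n - 2}
            outbox[v] = new
    return f, sent


# A 5-cycle (0..4) with a tail node 5 attached to node 4.
GRAPH = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0, 5], 5: [4]}
f, sent = flood_scalar_field(GRAPH)
print(f)
```

On this graph every node forwards each ID exactly once even though an entry
is kept for only two further iterations, which is the behaviour Theorem 4.2
describes.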

4.3.3 Diameter nodes 

Once $f$ is computed, candidate diameter nodes are found by a consensus that
maximizes $f$ over the network, using a simple gossip algorithm. There are many
algorithms in the literature for computing such aggregates on the network,
for example [6]. The essence of such algorithms is that, at each iteration, if
a node "discovers" a new maximum value, it broadcasts this value to all
its neighbors. Similarly, the diameter nodes are obtained from the candidate
nodes by a consensus on the minimum of the node IDs.
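
A maximum consensus of this kind can be simulated as below. The sketch
assumes synchronous rounds and, as a simplification of ours, lets every node
announce its own value once at the start; afterwards a node transmits only
when it learns a strictly larger value. It is not the specific algorithm of
[6].

```python
def max_consensus(adj, values):
    """Flood the maximum of `values`; return (final values, broadcast counts)."""
    best = dict(values)
    outbox = dict(values)               # assumed initial announcement
    broadcasts = {v: 1 for v in adj}
    while outbox:
        inbox = {}
        for v, val in outbox.items():
            for w in adj[v]:
                inbox[w] = max(inbox.get(w, val), val)
        outbox = {}
        for v, val in inbox.items():
            if val > best[v]:           # a new maximum was discovered
                best[v] = val
                outbox[v] = val
                broadcasts[v] += 1
    return best, broadcasts


# Path graph 0 - 1 - 2 - 3 with hypothetical f values.
LINE = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
best, broadcasts = max_consensus(LINE, {0: 1, 1: 5, 2: 2, 3: 3})
print(best, broadcasts)
```

The minimum-ID consensus used for tie-breaking follows from the same routine
by negating the IDs.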

4.3.4 Finding Boundary Nodes 

As the physical positions of the nodes do not change, we form a virtual
segmentation by finding boundary nodes $B = \{b_i\}$ within a partition which stop
messages from passing through. This effectively separates a given partition
into two parts with non-intercommunicating nodes.

At each node i in the segment
\\ Computing f
    \\ Initialization: discover itself
    add v_i to table and broadcast to N(i)
    \\ Run time
    at iteration n:
        \\ Check for new nodes discovered
        if found new nodes
            broadcast new nodes to N(i)
            add new nodes to table
            clear values of iteration n-2
        else
            f(v_i) = n
            stop.

At each node i in V_X
    \\ Initialization
    if v_i is a diameter node
        broadcast i to N(i). stop.
        (i will serve as the segment ID)
    else
        wait until reception
        if received two distinct IDs
            broadcast the lowest received ID to N(i).
            v_i = boundary node,
        wait one time interval
        if received two distinct IDs overall
            v_i = boundary node,
        stop.

Table 2: Finding Diameter Nodes and Boundary Nodes

For a set $B$ to behave like a boundary^1, it has to satisfy certain properties:

Definition Let $X = (V_X, E_X) \subseteq G$ be a connected sub-graph. The set of
nodes $B$ is said to be a boundary in $X$ if and only if $\exists$ two disjoint sets
$V_{X_1}, V_{X_2} \subset V_X$ such that there is a node $b_i \in B$ in any path
$(v_i, \ldots, v_j)$, where $v_i \in X_1$ and $v_j \in X_2$. Furthermore,
$V_{X_1} \cup V_{X_2} \cup B = V_X$.

If every path from $V_{X_1}$ to $V_{X_2}$ contains a boundary node, there is no path
along which a message from $V_{X_1}$ can reach $V_{X_2}$ without passing through $B$,
thus virtually separating the two. This justifies the above definition of the
boundary. The boundary nodes identify their neighbors as belonging to
$V_{X_1}$ or $V_{X_2}$, and do not transmit messages from one to the other.

To minimize the diameter of the resulting partitions ($S_1 = V_{X_1} \cup B$ and
$S_2 = V_{X_2} \cup B$), we choose the boundary nodes to be equidistant from the
determined diameter nodes. This causes the boundary nodes to bisect
the diameter of $X$. These equidistant nodes are obtained using a simple
flooding algorithm, which is presented in Table 2. The basic idea is to
start a flood from both diameter nodes and determine the boundary nodes
where these floods meet. Every node will belong either to $S_1$ or to $S_2$ since
$X$ is connected, and therefore $S_1 \cup S_2 = X_1 \cup X_2 \cup B = X$. Let the
diameter points be $x_1$ and $x_2$, and let $v_1 \in X_1$ and $v_2 \in X_2$. This implies
$v_1$ and $v_2$ received a single ID, ID($x_1$) and ID($x_2$) respectively. It follows
that for any path $p = (v_1, \ldots, v_2)$, $\exists\, v_i \in p$ such that $v_i$ received both
IDs and hence belongs to $B$. This shows that the nodes obtained as in Table 2
indeed satisfy the definition of the boundary.
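
The bisection can be imitated centrally with two breadth-first searches. The
sketch below rests on an assumption of ours: a node is treated as a boundary
node when its hop distances to the two diameter nodes differ by at most one,
which is how we read "receives two distinct IDs, possibly one time interval
apart" in Table 2. It is meant only to illustrate the separation property,
not to reproduce the message-level protocol.

```python
from collections import deque


def hop_distances(adj, source):
    """Hop count from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def bisect_partition(adj, x1, x2):
    """Split a connected partition into (side of x1, side of x2, boundary)."""
    d1, d2 = hop_distances(adj, x1), hop_distances(adj, x2)
    # Assumed rule: the two floods arrive at most one interval apart.
    boundary = {v for v in adj if abs(d1[v] - d2[v]) <= 1}
    side1 = {v for v in adj if d1[v] - d2[v] < -1}
    side2 = {v for v in adj if d1[v] - d2[v] > 1}
    return side1, side2, boundary


def separated(adj, side1, side2):
    """True if no edge joins the two sides directly."""
    return all(w not in side2 for v in side1 for w in adj[v])


# Path graph 0 - 1 - ... - 6 with diameter nodes 0 and 6.
CHAIN = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 6] for i in range(7)}
print(bisect_partition(CHAIN, 0, 6))
```

Since the difference of the two distances changes by at most two per hop, any
path between the two sides must pass through a node of the boundary set.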

An additional and very important property that a boundary partitioning
should satisfy is that it should preserve the topology of the original entity
(see Figure 6). Specifically, if $X$ has no holes in its coverage, then neither
$S_1$ nor $S_2$ should; and if $X$ has a hole, then it should be preserved in
one of the partitions. Theorem 4.3 shows that a sufficient condition for
preserving the topology is the contractibility of the Rips complex obtained
from the subgraph induced on the boundary nodes $B$. The Rips complex for
$X$ is obtained by taking all the cliques as simplices, and similarly for the
subgraphs induced on $S_1$, $S_2$ and $B$.

^1 This definition of a boundary should not be confused with the conventional
notion, confounded with the closure of a graph or a region. The particular
definition we are using will be clear from the context.

Figure 6: The partitioning on the left does not preserve topology whereas
that on the right does.

Theorem 4.3 Let $X$ be a simplicial complex and $A, B \subset X$ be subcomplexes
such that $A \cap B$ forms a boundary on the underlying graph. Then
$H_1(X) = H_1(A) \oplus H_1(B)$ if $A \cap B$ is contractible, i.e., if
$H_0(A \cap B) = \mathbb{R}$ and $H_1(A \cap B) = 0$.

Proof For any simplicial complex $X$ and $A, B \subset X$, $\exists$ the following exact
sequence, called the Mayer-Vietoris exact sequence:

$$H_1(A \cap B) \xrightarrow{\ \phi\ } H_1(A) \oplus H_1(B) \xrightarrow{\ \psi\ } H_1(X),$$

where $\phi$ and $\psi$ are linear operators. Now,
$H_1(A \cap B) = 0 \Rightarrow \mathrm{img}(\phi) = 0 \Rightarrow \ker(\psi) = 0 \Rightarrow \psi$ is injective since it
is linear. Therefore, $H_1(A) \oplus H_1(B) \subseteq H_1(X)$.

Let $c \in \ker(\partial_1(C_1^X))$ be a chain in the null space of the first boundary
operator acting on the first chain space of $X$ ($c$ is a closed path), such
that it contains $v_1 \in A$ and $v_2 \in B$. $\exists\, b_1, b_2 \in c$, $b_1 \neq b_2$, such that
$b_1, b_2$ also $\in A \cap B$, since $A \cap B$ is a boundary (see Figure 7). Now, since
$A \cap B$ is connected, $\exists$ a chain corresponding to the path $b_1 \to b_2$. Consider
the two chains corresponding to the closed paths
$c^1 := (v_1 \to b_1 \to b_2 \to v_1)$ and $c^2 := (v_2 \to b_2 \to b_1 \to v_2)$. It immediately
follows that $c^1 + c^2 = c$. Therefore, any chain in $\ker(\partial_1(C_1^X))$ can be
expressed as a sum of chains in $\ker(\partial_1(C_1^A))$ and $\ker(\partial_1(C_1^B))$
$\Rightarrow H_1(X) \subseteq H_1(A) \oplus H_1(B)$
$\Rightarrow H_1(X) = H_1(A) \oplus H_1(B)$. $\blacksquare$

Figure 7: A chain in $X$ can be represented as a sum of chains in $A$ and $B$,
where $A, B \subset X$ and $A \cap B$ is connected.

The first part of the theorem states that no new holes are created by
the partitioning, and the second states that all the holes are preserved. If
the boundary nodes obtained by the algorithm given in Table 2 are not
connected, we can form a tree by joining the different connected components
by a shortest path between them. This shortest path can be discovered by
a simple flooding in the network originating at the connected components.
The boundary obtained is also usually contractible, aside from one exception.
As shown in Figure 8, this happens exactly when
$d(x_1, v_1) = d(x_1, v_2) = d(x_2, v_3) = d(x_2, v_4)$ in the given configuration, where
$x_1$ and $x_2$ are the diameter nodes. In this case, all the nodes
$v_1, v_2, v_3, v_4$ will be made boundary nodes. Note that we use the
contractibility condition only to prove that no new holes are created, which
is clearly also valid in this case.

Once we have found these boundary nodes, we can proceed to partition the
network into two. We subsequently compute the Laplacian matrix and check
for rank deficiency as described in Section 4.2.

4.4 Complexity Analysis 

Figure 8: Exception case when $B \ni v_1, v_2, v_3, v_4$ is not contractible

Algorithms in sensor networks depend on a variety of factors such as
inter-node communication, in-node processing, memory requirements and run
time. These carry different costs depending on the context, and the
communication cost is almost always dominant. We conduct a complexity analysis
accounting for these pertinent points. The complexity also depends on the
spatial arrangement of the nodes. For simplicity, we assume that the region
of deployment is convex. We also focus on the average cost per node
rather than the cost of the entire network. Most of the complexity of the
detection/localization algorithm (and therefore the bottlenecks) depends on
three factors: 1) evaluating the function $f$, 2) finding $\max(f)$ and
3) finding the spectral radius of the Laplacian. Furthermore, since each
iteration of the partitioning procedure sees half of the surviving nodes
removed, the average cost per node primarily depends on the first iteration.

4.4.1 Communications 

For evaluating $f$, each node discovers every other node at some point and
broadcasts the information to its neighbors. Each node broadcasts the
discovery precisely once for every other node. As a result, the complexity
per node for evaluating $f$ is $O(n)$, where $n$ is the number of nodes.
Evaluating the complexity of determining $\max(f)$ is rather peculiar, since
the behavior of a node depends on the value of $f$ at that node. Recall from
Section 4.3.3, titled "Diameter nodes", that if a node encounters a function
value higher than any previously recorded value, this information is
broadcast. The nodes with the highest value of $f$, for example, never
broadcast anything during this part of the algorithm. In order to evaluate
the complexity in this case, we consider a simple case where nodes are
deployed in a circular region. The radius of this circle will be
$\rho \propto \sqrt{n}$. In this case, the nodes which lie on the circle of radius $r$
(see Figure 9) will broadcast the discovery of exactly $\rho - r$ nodes which
have a higher $f$ value than previously recorded.

Figure 9: The simple case used for assessing complexity

The number of broadcasts over the entire network would end up as

$$\sum_{i=1}^{\rho} i(\rho - i) = \frac{\rho^2(\rho+1)}{2} - \frac{\rho(\rho+1)(2\rho+1)}{6} = \frac{\rho(\rho-1)(\rho+1)}{6} \propto \rho^3 \propto n^{3/2}.$$

The average complexity per node for evaluating $\max(f)$ is therefore $O(\sqrt{n})$.
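
The closed form of the sum can be confirmed numerically. The short check
below is only a sanity test of the arithmetic; the function names are ours
and the radius is treated as an integer number of hops, which is an
assumption made for the check.

```python
def total_broadcasts(rho):
    """Direct evaluation of sum_{i=1}^{rho} i * (rho - i)."""
    return sum(i * (rho - i) for i in range(1, rho + 1))


def closed_form(rho):
    """rho (rho - 1) (rho + 1) / 6, an integer for every integer rho."""
    return rho * (rho - 1) * (rho + 1) // 6


for rho in (5, 10, 20):
    # With n proportional to rho^2, the per-node cost grows like rho / 6.
    print(rho, total_broadcasts(rho), closed_form(rho), total_broadcasts(rho) / rho**2)
```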



Figure 10(a) shows a log-log relation between the number of nodes $n$ and
the number of memory words broadcast for finding the diameter nodes in the
first partition. For each value of $n$, we averaged over 5 networks. A linear
regression (line in blue) shows a slope of $0.9 \approx 1$, confirming the dominant
effect of evaluating $f$ at $O(n)$ cost. The complexity of finding the spectral
radius of $L_1$ will be proportional to the mean degree of the nodes (as the
number of values broadcast in the power iteration method is proportional to
the number of neighbors) and depends logarithmically on the ratio
$\alpha_1/\alpha_2$, where $\alpha_1$ and $\alpha_2$ are the first and second largest eigenvalues.
The difficulty of estimating these eigenvalues a priori for a random matrix
complicates expressing this ratio as an explicit function of $n$. We
therefore provide some numerical results, shown in Figure 10(b). This figure
compares the total number of memory words broadcast for detecting a hole
(evaluating the spectral radius of $L_1$ and of $\alpha I - L_1$) in our algorithm
with that required for localizing a hole by an $\ell_1$ norm minimization as
presented in [30].

Figure 10: Complexity analysis for (a) finding diameter nodes and
(b) localizing holes, plotted against the number of nodes. Panel (b) compares
the 'divide and conquer' method with the $\ell_1$ optimization in [30].



4.4.2 Memory 

The only bottleneck in our algorithm in terms of memory lies in estimating
the diameter points of the partition. At the $i$-th time step during this
process, a node discovers all the other nodes in the partition which are at
an $i$-hop distance. The node keeps this information for 2 time steps and
then deletes it. The iso-distance paths on the network from any node are of
the order of $O(\sqrt{n})$, which translates into the memory requirement of the
algorithm.



4.4.3 Run Time 

First, note that the number of partitions required to converge onto a hole is
related to the number of nodes as $O(\log(n))$. The time required for finding
both the diameter nodes and the boundary nodes is directly proportional
to the diameter of the network, i.e., $O(\sqrt{n})$. The number of iterations
required for the power iteration method (for computing the spectral radius of
the Laplacian) to converge, similar to its communication cost, is of the order
$O(\log(\alpha_1/\alpha_2))$. In each iteration of the power method, finding the sum
(for normalizing) requires time of the order $n\log(n)$ [3] [2].



4.5 Simulation Examples 

Figure 11 shows the algorithm on a random network with 50 nodes. Figure
11(a) shows the communication graph superimposed on the coverage area.
In the first partition, the boundary nodes are indicated by the red circles
and the diameter nodes are indicated in black. The boundary nodes dictate
where the partition occurs and, as shown in Figure 11(b), all the nodes in the
partition which does not enclose a hole are no longer considered. An important
point is that, as the algorithm progresses, additional nodes are put to rest,
saving valuable power. In the end, only the cycle closest to the coverage hole
survives, providing a good indication of where the failure took place.



5 Worm Hole Problem 

A worm hole attack is launched by two colluding external attackers which
do not authenticate themselves as legitimate nodes of the network. When
starting a worm hole attack, one attacker overhears packets at one point in
the network and tunnels these packets through the worm hole link (external to
the network) to another point in the network. This creates the false
impression that the original sender is in the neighborhood of the remote
location. An example of a worm hole attack is shown in Figure ??. In this
section, our aim is first to show how to detect whether such an attack is
taking place and, if so, to locate the attack positions. By way of a simple
observation, we show that the algorithm for finding a coverage hole may be
extended to address this problem.



5.1 Worm Hole Detection 

Because a worm hole links geographically separated positions in the network,
it essentially creates a cycle in the network which cannot be a boundary of
a 2-simplex. It thus creates a non-zero homology component. Figure 12(b)
shows a network with a worm hole link and the resulting deformation of the
network structure, which yields a cycle. We have already seen in Section 4
how to localize this cycle. It is hence clear that the presence of a worm hole
in the network would be followed by a localization of the shortest cycle it
creates.

The problem now reduces to identifying this cycle as a coverage hole or
a worm hole. To this end, we formulate a simple algorithm, shown in Table 3.
The central idea of our approach is based on the observation that a
cycle surrounding a coverage hole lies on the surface in which the network is
deployed, while a cycle created by a worm hole does not lie on this surface.

Figure 11: The sequence of surviving subgraphs with each partition:
(a) communication graph superimposed on the coverage area; (b)-(h) after
partitions 1 to 7.

Figure 12: Deformation in network structure because of a worm hole:
(a) network grid with links caused by a worm hole; (b) the same grid shown in
3D to respect distance properties measured as hop distances. The cycle
created is shown in red.


Removing a cycle which surrounds a coverage hole will therefore divide the
network into two components, while removing a cycle created by a worm hole
does not. Figure 13 demonstrates this case. The network grid in Figure 13(a)
shows a coverage hole and the shortest cycle surrounding it. This cycle is
grown homologously in the network, i.e., without creating or destroying any
loops, as shown in Figure 13(b). When the nodes on this cycle, along
with their neighbors, are removed from the network, the resulting network
consists of two connected components, as shown in Figure 13(c). Figures
13(d), 13(e) and 13(f) show the same process for a cycle created by a
worm hole. As seen in Figure 13(f), the resulting network is still a single
connected component. The two steps, 1) growing the cycle and 2) removing the
nodes along with their neighbors, are explained in detail in the following
sections.

Figure 13: The structural difference between the cycles created by a coverage
hole and by a worm hole: (a) cycle surrounding a coverage hole; (b) the cycle
grown in the network; (c) grown cycle removed; (d) cycle created by the
worm hole; (e) the cycle grown in the network; (f) the grown cycle removed.



Grow the current cycle to get a longer homologous cycle in the network.
Remove the longer cycle along with its neighbors.
if the above process creates an isolated component
    The cycle corresponds to a coverage hole.
else
    The cycle corresponds to a worm hole.

Table 3: Algorithm for Detecting Worm Holes
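
The decision rule of Table 3 can be sketched as a connectivity check. The
code below assumes that the cycle has already been grown and is given as a
list of node IDs, and that the network is an adjacency dictionary of sets;
the function names are ours. It removes the cycle together with its
neighbors and counts the connected components that remain.

```python
def components_after_removal(adj, cycle):
    """Number of connected components left after deleting cycle + neighbors."""
    removed = set(cycle)
    for v in cycle:
        removed |= set(adj[v])
    remaining = set(adj) - removed
    seen, count = set(), 0
    for start in remaining:
        if start in seen:
            continue
        count += 1
        stack = [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w in remaining and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return count


def classify_cycle(adj, grown_cycle):
    """Table 3: an isolated component indicates a coverage hole."""
    if components_after_removal(adj, grown_cycle) >= 2:
        return "coverage hole"
    return "worm hole"
```

The rule is applied to the grown cycle rather than to the shortest one, since
removing the boundary of a hole by itself would not split the network.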



5.1.1 Growing the Cycles 

The algorithm described in Section 4 yields the shortest (in the sense of hop
distance) non-contractible cycle which, in the case of a coverage hole, leads
to the boundary. Such a boundary will not serve our purpose, as removing a
boundary will not partition the network into two. We hence have to grow
this cycle "into" the network. The important properties which we have to
abide by in the course of this cycle growth are:

• We should not break the cycle at any time.

• We should not introduce any additional loops into the cycle.

A cycle which originally surrounded a coverage hole will no longer do so after
it is broken. If we introduce loops into the cycle during the growing
procedure, a cycle due to a worm hole will become similar to one surrounding
a coverage hole. The above two properties are precisely captured by the
idea of homologous chains. Two chains which belong to the same equivalence
class in the homology space are said to be homologous. Recall from Section
3.3 the definition of homology groups as $H_k(C_*) = \ker(\partial_k)/\mathrm{img}(\partial_{k+1})$. If
$c_1, c_2 \in \ker(\partial_1)$ belong to the same equivalence class, then
$c_1 - c_2 \in \mathrm{img}(\partial_2)$, i.e., their difference can be written as a sum of the
boundaries of 2-simplices (triangles). To that end, we "homologously" grow
the cycle by applying two elementary steps, both of which add the boundary of
a 2-simplex to the existing chain.

Elementary Step 1 If two adjacent nodes $v_1, v_2$ in the chain share a
common neighbor $v_3$, we then remove the edge $(v_1, v_2)$ from the chain,
and add the edges $(v_1, v_3)$ and $(v_3, v_2)$. This step is shown in
Figure 14.

Figure 14: (a) before, (b) after Elementary Step 1, and (c) the difference of
the chains before and after the step. Note that (c), which is $c_1 - c_2$, is
the boundary of a 2-simplex. The edges shown in red are part of the chain.

Elementary Step 2 If a node $v_1$ on the chain has two neighbors $v_2, v_3$
which are also on this chain, and $v_2, v_3$ are neighbors, we then remove the
edges $(v_2, v_1)$, $(v_1, v_3)$ and we add the edge $(v_2, v_3)$. This step
is shown in Figure 15.

Figure 15: (a) before, (b) after Elementary Step 2, and (c) the difference of
the chains before and after the step. Note that (c), which is $c_1 - c_2$, is
the boundary of a 2-simplex. The edges shown in red are part of the chain.
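
The two elementary steps can be sketched in Python as follows. This is a
minimal illustration under assumptions that are not part of the text above:
the cycle is stored as a cyclic list of distinct nodes, the network as an
adjacency dictionary of sets, and chains are compared with coefficients
modulo 2, so that the difference of two chains is the symmetric difference of
their edge sets. The function names are hypothetical.

```python
def cycle_edges(cycle):
    """Edge set of a cycle given as a cyclic list of nodes."""
    n = len(cycle)
    return {frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)}


def step1(adj, cycle, i, v3):
    """Elementary Step 1: replace the edge (v1, v2) by (v1, v3) and (v3, v2),
    where v1 = cycle[i] and v2 is its successor on the cycle."""
    v1, v2 = cycle[i], cycle[(i + 1) % len(cycle)]
    if v3 in cycle:
        raise ValueError("v3 is already on the cycle (would create a loop)")
    if v3 not in adj[v1] or v3 not in adj[v2]:
        raise ValueError("v3 is not a common neighbor of v1 and v2")
    return cycle[:i + 1] + [v3] + cycle[i + 1:]


def step2(adj, cycle, i):
    """Elementary Step 2: drop v1 = cycle[i] when its two cycle neighbors
    v2 and v3 are adjacent, replacing (v2, v1), (v1, v3) by (v2, v3)."""
    if len(cycle) <= 3:
        raise ValueError("cycle is too short to shrink")
    v2, v3 = cycle[i - 1], cycle[(i + 1) % len(cycle)]
    if v3 not in adj[v2]:
        raise ValueError("the cycle neighbors of v1 are not adjacent")
    return cycle[:i] + cycle[i + 1:]


# Toy network: a square 0-1-2-3 plus a node 4 adjacent to both 0 and 1.
adj = {0: {1, 3, 4}, 1: {0, 2, 4}, 2: {1, 3}, 3: {0, 2}, 4: {0, 1}}
cycle = [0, 1, 2, 3]
grown = step1(adj, cycle, 0, 4)      # push the edge (0, 1) out through node 4
difference = cycle_edges(cycle) ^ cycle_edges(grown)
shrunk = step2(adj, grown, 1)        # undo the growth by removing node 4
```

In this toy case the difference between the chains before and after Step 1
consists of the three edges of the triangle (0, 1, 4), in line with panel (c)
of Figure 14, and applying Step 2 at node 4 returns the original cycle.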



Figure 16: An edge (not in the cycle) "crosses" one on the cycle. In this 
case, removing just the nodes on the cycle would not separate the network 
into two components. 

5.1.2 Removing the cycle 

We saw in Figure 13 that removing the nodes on a cycle surrounding a
coverage hole, along with their neighbors, yields two disconnected components.
Removing just the nodes on the cycle is insufficient because there might be
two adjacent nodes in the graph whose edge "crosses" an edge on this cycle,
as shown in Figure 16. However, it can be shown that removing the nodes on a
cycle along with their neighbors will result in the required separation [26].
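
A small example makes the role of the neighbors concrete. The network below
is a hypothetical toy graph, not one taken from the figures: a 6-cycle
c0..c5, an inner chain i0-i1, an outer chain o0-o1, and an edge (i0, o0) that
"crosses" the cycle edge (c0, c1). The helper name components is likewise an
assumption.

```python
def components(adj, removed):
    """Connected components of the graph once `removed` nodes are deleted."""
    seen, parts = set(removed), []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, part = [start], {start}
        while stack:
            u = stack.pop()
            for w in adj[u] - seen:
                seen.add(w)
                part.add(w)
                stack.append(w)
        parts.append(part)
    return parts


edges = [
    ("c0", "c1"), ("c1", "c2"), ("c2", "c3"),
    ("c3", "c4"), ("c4", "c5"), ("c5", "c0"),   # the cycle
    ("i0", "c0"), ("i0", "c1"), ("i0", "i1"),   # inner side
    ("o0", "c0"), ("o0", "c1"), ("o0", "o1"),   # outer side
    ("i0", "o0"),                               # edge crossing (c0, c1)
]
adj = {}
for a, b in edges:
    adj.setdefault(a, set()).add(b)
    adj.setdefault(b, set()).add(a)

cycle = {"c0", "c1", "c2", "c3", "c4", "c5"}
cycle_and_neighbors = cycle.union(*(adj[v] for v in cycle))

only_cycle_removed = components(adj, cycle)
neighbors_removed_too = components(adj, cycle_and_neighbors)
```

Removing only the cycle leaves i0, i1, o0 and o1 in a single component held
together by the crossing edge, whereas removing the cycle with its neighbors
also deletes i0 and o0 and separates i1 from o1.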




Note that until this point, we have made no assumption about the density
of the nodes in the network. We can successfully identify whether the shortest
non-contractible cycle identified earlier corresponds to a coverage hole
or a worm hole, i.e., we can detect whether there is a worm hole in the
network. We have thus far restricted the location of a worm hole attack to a
relatively small subgraph of the network (the shortest cycle). In the
following section, we present an effective approach to precisely locate the
worm hole in question.

5.2 Worm Hole Localization 

In order to precisely locate a worm hole, we first closely examine its impact.
Denote the colluding nodes in the attack as X and Y (Figure ??). As a
result of this attack, a node in the vicinity of X considers all the nodes in
the vicinity of Y as its neighbors. This also results in the formation of a
cycle. Observe that a simple way to undo the effect of a worm hole is to
remove all the nodes in the vicinity of X and Y. Note that the algorithm
described earlier finds the shortest cycle, implying that there will be
exactly two nodes on this cycle which are in the vicinity of X or Y. In this
light, we propose a simple algorithm, given in Table 4, to localize the worm
hole.

Figure 17: Worm Hole Localization. (a) $v_1, v_2$ not in the vicinity of X
and Y. A shortest path can be found in the network surrounding the nodes
removed. (b) $v_1, v_2$ in the vicinity of X and Y. The alternative shortest
path includes all the nodes in the cycle.

Figure 18: A sparse network where the algorithm fails. The removal of nodes
around $v_1$ and $v_2$ creates an isolated component in the network.

for each adjacent pair $v_1, v_2$ in the cycle
    Remove the edge $(v_1, v_2)$ and all neighbors of $v_1$ and $v_2$
        except those on the cycle.
    Find the shortest path between $v_1$ and $v_2$.
    if this shortest path coincides with the nodes on the cycle
        $v_1$ and $v_2$ are in the vicinity of X and Y.

Table 4: Algorithm for Localizing Worm Holes

If $v_1, v_2$ were indeed in the vicinity of X and Y, then removing all their
neighbors would remove all the spurious links caused by the worm hole. In
this case, the shortest path between $v_1$ and $v_2$ would be the rest of the
cycle. If, on the other hand, they were not in the vicinity of X or Y, then
they would find an alternative path in the network which surrounds the deleted
nodes. The result of this algorithm is shown in Figure 17.
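
The procedure of Table 4 can be sketched in Python as follows. This is a
hedged illustration rather than the implementation used in this paper: the
adjacency dictionary, the function names and the toy grid are assumptions,
the worm hole is modeled as a single spurious edge, and ties between equally
short paths are resolved arbitrarily by the breadth-first search.

```python
from collections import deque


def shortest_path(adj, src, dst, removed_nodes, removed_edge):
    """Breadth-first shortest path from src to dst that avoids the removed
    nodes and the single removed edge; returns None if dst is unreachable."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w in parent or w in removed_nodes or {u, w} == removed_edge:
                continue
            parent[w] = u
            queue.append(w)
    return None


def is_wormhole_pair(adj, cycle, i):
    """Apply the test of Table 4 to the adjacent pair cycle[i], cycle[i+1]."""
    v1, v2 = cycle[i], cycle[(i + 1) % len(cycle)]
    on_cycle = set(cycle)
    removed = (adj[v1] | adj[v2]) - on_cycle   # neighbors not on the cycle
    path = shortest_path(adj, v1, v2, removed, {v1, v2})
    return path is not None and set(path) == on_cycle


def localize(adj, cycle):
    """Indices i whose pair (cycle[i], cycle[i+1]) passes the test."""
    return [i for i in range(len(cycle)) if is_wormhole_pair(adj, cycle, i)]


# Hypothetical toy network: a 7 x 13 grid with 4-neighbor links and one
# spurious worm hole edge between X = (3, 1) and Y = (3, 11).
grid = {(r, c): set() for r in range(7) for c in range(13)}
for (r, c) in list(grid):
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if (r + dr, c + dc) in grid:
            grid[(r, c)].add((r + dr, c + dc))
grid[(3, 1)].add((3, 11))
grid[(3, 11)].add((3, 1))

cycle = [(3, c) for c in range(1, 12)]   # row 3 from X to Y, closed by the link
flagged = localize(grid, cycle)
```

In this toy grid the pair joined by the spurious link (index 10) passes the
test, while a pair in the middle of the cycle, such as index 4, finds a
shorter detour around the deleted nodes and is rejected. The two pairs that
share an endpoint with the spurious link (indices 0 and 9) are also flagged
in this single-edge model, so the output is best read as a small candidate
set around the worm hole.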

We note that this algorithm assumes a minimal node density to perform
properly, and that it is most effective when the network is sufficiently
dense. When the network is sparse, the removal of the neighbors of $v_1$ and
$v_2$ may eliminate all possible paths between them, with the exception of
those going through the links created by the worm hole. For example, in
Figure 18, since the network is very sparse, the removal of the neighbors of
$v_1$ and $v_2$ creates an isolated component, which is only linked by the
worm hole. Any path will therefore have to go through one of the links
created by the worm hole.

6 Conclusion 

In this work, we addressed the topological analysis of sensor networks
through two specific problems: 1) coverage hole localization and 2) an
extended application to worm hole attack localization. We have shown that
topological analysis of a network provides us with ample information to
assess its health, and requires minimal prior information. To that end, we
have proposed an algebraic topological approach as an elegant and efficient
avenue for extracting useful information. The formulation in an algebraic
domain enables us to utilize extensive existing tools to effectively address
these problems. We have also, by way of the computational efficiency of our
proposed approach, addressed a very crucial problem in sensor networks,
namely that of prolonging the battery life of the nodes.